Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data
Abstract
Audio Word2Vec offers vector representations of fixed dimensionality for variable-length audio segments using a Sequence-to-sequence Autoencoder (SA). These vector representations have been shown to describe the sequential phonetic structures of the audio segments to a good degree, with real-world applications such as query-by-example Spoken Term Detection (STD). This paper examines the language transfer capability of Audio Word2Vec. We train the SA on one language (the source language) and use it to extract vector representations of audio segments in another language (the target language). We found that the SA can still capture the phonetic structure of audio segments in the target language if the source and target languages are similar. In query-by-example STD, we obtain vector representations from an SA learned from a large amount of source language data, and find that they surpass the representations from a naive encoder and from an SA learned directly from a small amount of target language data. The results show that it is possible to learn an Audio Word2Vec model from high-resource languages and use it on low-resource languages. This further expands the usability of Audio Word2Vec.
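The retrieval step of query-by-example STD over fixed-dimensional vectors can be illustrated with a minimal sketch. It assumes the segment vectors have already been extracted by some trained encoder, and it assumes cosine similarity as the ranking score; the abstract does not specify the similarity measure. The function names `cosine_similarity` and `rank_segments` and the toy vectors are hypothetical and serve only as illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_segments(
    query_vec: Sequence[float],
    segment_vecs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Rank archive segments by similarity to the spoken query, best match first."""
    scored = [
        (seg_id, cosine_similarity(query_vec, vec))
        for seg_id, vec in segment_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for encoder outputs (made-up values,
# not results from the paper).
query = [1.0, 0.0, 1.0]
archive = {
    "seg_a": [2.0, 0.0, 2.0],
    "seg_b": [0.0, 1.0, 0.0],
    "seg_c": [1.0, 1.0, 0.0],
}
ranking = rank_segments(query, archive)
```

Because every segment maps to a vector of the same dimensionality regardless of its duration, the comparison reduces to one vector operation per archive segment. In the language transfer setting, the only change is that the encoder producing the vectors was trained on source language audio.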
Journal: CoRR
Volume: abs/1707.06519
Publication date: 2017